DDAffinity: predicting the changes in binding affinity of multiple point mutations using protein 3D structure

Abstract Motivation Mutations are the crucial driving force for biological evolution as they can disrupt protein stability and protein–protein interactions which have notable impacts on protein structure, function, and expression. However, existing computational methods for protein mutation effects prediction are generally limited to single point mutations with global dependencies, and do not systematically take into account the local and global synergistic epistasis inherent in multiple point mutations. Results To this end, we propose a novel spatial and sequential message passing neural network, named DDAffinity, to predict the changes in binding affinity caused by multiple point mutations based on protein 3D structures. Specifically, instead of being on the whole protein, we perform message passing on the k-nearest neighbor residue graphs to extract pocket features of the protein 3D structures. Furthermore, to learn global topological features, a two-step additive Gaussian noising strategy during training is applied to blur out local details of protein geometry. We evaluate DDAffinity on benchmark datasets and external validation datasets. Overall, the predictive performance of DDAffinity is significantly improved compared with state-of-the-art baselines on multiple point mutations, including end-to-end and pre-training based methods. The ablation studies indicate the reasonable design of all components of DDAffinity. In addition, applications in nonredundant blind testing, predicting mutation effects of SARS-CoV-2 RBD variants, and optimizing human antibody against SARS-CoV-2 illustrate the effectiveness of DDAffinity. Availability and implementation DDAffinity is available at https://github.com/ak422/DDAffinity.


Introduction
Mutations, i.e. protein amino acid substitutions, play a crucial role in natural evolution by disrupting protein stability and protein-protein interactions (PPIs) as they can severely affect protein structure, function, expression, and solubility (Pires et al. 2014).A representative example is variants in receptorbinding domain (RBD) of the spike protein of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which affects its binding to the human angiotensin-converting enzyme 2 (ACE2) receptor and its recognition by human antibodies (Starr et al. 2022).Mutations in ACE2 are also used for engineering high-affinity therapeutic decoy receptor candidates (Chan et al. 2020).Therefore, understanding how mutations affect the changes in binding affinity is crucial for the development and optimization of a wide range of biotechnology products, including antibodies, industrial-grade enzymes, and other protein-based therapeutics and reagents (Rouet et al. 2014, Goldenzweig andFleishman 2018).
Binding affinity, which spans more than nine orders of magnitude, is used to measure the binding strength between PPIs (Erijman et al. 2014).Precisely distinguishing the changes in binding affinity (i.e.ΔΔG) upon mutations is an important but still elusive and unsolved challenge in protein engineering.Much research has been devoted to developing computational tools, including traditional methods, machine learning and deep learning approaches.
Traditional methods, such as PoPMuSiC (Kwasigroch et al. 2002) and BeAtMuSiC (Dehouck et al. 2013), generally aim to understand the effects of single point mutations by comparing the statistical differences between wild-type and mutant.These traditional methods, also like Rosetta (Alford et al. 2017, Leman et al. 2020), FoldX (Delgado et al. 2019), etc., mainly use physical energy features to describe mutation effects, which often lead to increased computational costs associated with large-scale conformation space sampling and energy optimization.
Aside from traditional methods, machine learning approaches have been developed to predict protein mutation effects, which largely rely on feature engineering summarized as physicochemical, biophysical, and quantum mechanical characteristics of proteins.Two typical approaches for predicting the effects of single point mutations are mCSM (Pires et al. 2014) and INPS (Fariselli et al. 2015), where mCSM is based on graph-based features constructed from Cutoff Scanning Matrix (Pires et al. 2011), pharmacophore counts, and experimental conditions, while INPS constructs SVM regressions based on protein sequences.To represent the effects of multiple point mutations, mmCSM-PPI (Rodrigues et al. 2021) calculates the sum and average feature values of six categories, including dynamics, residue conservation, and residue environment properties, etc.Although widely used, the performance of machine learning methods heavily relies on expert knowledge, which limits their development with the rapid growth of available protein 3D structure data.
In recent years, the rapid accumulation of high resolution protein structures, aided by experimental methods such as X-ray crystallography, nuclear magnetic resonance (NMR), and cryo-electron microscopy (cryo-EM), has facilitated the widespread application of deep learning-based methods in the fields of protein mutation effects prediction.These approaches can be divided into end-to-end and pre-training based methods.End-to-end models typically incorporate wild-type and mutant protein structures as inputs.For example, DeepDDG (Cao et al. 2019) shares neural network parameters between each target-neighbor residue pair and considers the local environment of the mutant residues.Analysis indicates that the solvent accessible surface area (SASA) of the mutated residue, which determines the buried hydrophobic area, is a major determinant of protein stability.With mutant structures sampled by the Rosetta cartesian_ddg program (Park et al. 2016, Frenz et al. 2020), DDGPred (Shan et al. 2022) integrates inter-residue interaction features using geometric attention and physical energy terms to predict the changes in binding affinity with an anti-symmetric network.Taking sequences, structures, and energy features as input, UniBind (Wang et al. 2023) incorporates a dualpath neural network with attention mechanism, multi-task learning, and model ensemble to extract information for prediction of protein mutation effects from diverse biological datasets.
Pre-training based models for predicting protein mutation effects using protein 3D structures generally have the following two main objectives.First, it is how to construct an adaptive encoder to effectively represent the 3D structure of proteins.Second, it is how to design appropriate pretext tasks for large-scale unlabeled datasets (i.e.constructing efficient objective functions to supervise the pre-training process), which requires specialized architectures and large-scale training data, leading to increased computational costs associated with model design and implementation.GeoPPI (Liu et al. 2021) constructs a geometric encoder by incorporating graph attention network (GAT) to capture the structure features of protein complexes at atomic level.They optimize the conformation of amino acid sidechains through self-supervised learning and integrate gradient-boosting trees (GBT) to predict changes in binding affinity.The encoder of RDE-Network (Luo et al. 2023) first generates embeddings for each individual residue and each pair of residues with two multilayer perceptrons (MLPs), then utilizes a self-attention based network to transform these embeddings into hidden representations for prediction.DiffAffinity (Liu et al. 2023) uses a similar encoder network to RDE-Network to transform the structural context of mutation sites with a selfattention based network.The difference lies in the use of the Invariant Point Attention Module (IPA), which is part of the SE(3)-invariant network presented in AlphaFold2 (Jumper et al. 2021).In addition to above mentioned encoder networks, pre-training models typically estimate amino acid sidechain conformations to learn protein representations from large-scale unlabeled protein structures using different self-supervised learning techniques.Overall, these models focus on global dependencies modeling of PPIs and perform well on single point mutations.
However, the aforementioned approaches do not take into account the local and global synergistic epistasis of the covalent interactions as well as noncovalent interactions, which are inherent to the PPIs of multiple point mutations.To this end, we propose a novel distance-dependent and sequence-dependent message passing neural network (MPNN) incorporating biological domain knowledge to learn the PPIs systematically, named DDAffinity, to predict the changes in binding affinity caused by multiple point mutations using protein 3D structure, which is simple yet effective due to incorporating neighborhood message passing and aggregation on protein k-nearest residue graphs.Extensive experiments have been conducted to validate the superior performance of our method.Specifically, on the internal test set, compared with baseline studies on the prediction of multiple point mutation effects, DDAffinity achieves state-of-the-art performances on all evaluation metrics.On the external test set, DDAffinity demonstrates the best performance on the tasks of human antibody optimization, SARS-CoV-2 RBD variants mutation effects prediction, and multiple point mutation nonredundant blind testing as well.
In summary, our contributions are as follows.(i) we formulate protein multiple point mutation effects prediction as a structure representation learning problem on the k-nearest neighbor graph.Compared with existing methods, further dividing MPNN into spatial, sequential and long-range interactions for message passing and aggregation, and combining domain knowledge of PPIs can notably improve the prediction performance.(ii) We propose a novel two-step additive Gaussian noising strategy for the input and backbone atomic coordinates during training with the aim of data diversity and eliminating the inconsistency introduced during the energy optimization of mutant structure generation.(iii) We utilize residue centrality normalization to characterize residue burial, which has been shown significant effects on protein structure and stability.In conclusion, we have integrated domain knowledge that affects protein multiple point mutations into the design of our model, which demonstrates strong predictive performance and generalization on multiple metrics on both internal and external test set in a systematic way.

Problem statement
Given the wild-type protein complex (WT ) and mutant protein complex (MT ), the changes in binding affinity upon mutations is calculated as follows: (1) where ΔG wildÀ type and ΔG mutant are the changes in Gibbs free energy of wild-type and mutant respectively.The sign of ΔΔG indicates whether the mutations are stable or not.Taking WT and MT as input, the global objective function of our approach is to apply a neural network f to approximate the solution of mutation effects prediction problem as follows:

Datasets
The training and validation datasets utilized in this study are derived from SKEMPI2 (Jankauskait_ e et al. 2019), which comprises 345 complexes and 7805 point mutations with experimentally determined ΔΔG values.SKEMPI2 is curated by calculating the average mutation effect for variants reported in multiple experiments, and its 3D structures have been deposited in the Protein Data Bank (PDB).
In addition to the overall SKEMPI2 dataset, two subsets of SKEMPI2 are used as benchmark datasets for comparing the predictive performance of DDAffinity with baseline models, namely S1131 and M1707.According to Xiong et al. (2017), S1131 is a subset with 1131 nonredundant interface single point mutations.M1707 is a subset with 1707 multiple point mutations, including 1337 multiple point mutations plus reverse mutations (Zhang et al. 2020).We also conduct experiment on a multiple point mutation nonredundant blind testing dataset at the mutation-level, which were proposed in mmCSM-PPI (Rodrigues et al. 2021) and also curated from SKEMPI2.We filter out problematic structures, and mark them as M1340 and M595.To estimate the anti-symmetric property, we use the Ssym dataset presented in Pucci et al. (2018), in which the proportion of direct and inverse variations is unbiased.
Furthermore, we also include two external validation datasets for case study.The first consists of 285 single point mutations of the ancestral Wuhan-Hu-1 RBD (PDB ID: 6M0J) (Starr et al. 2022), whose ΔΔG values have been experimentally quantified by deep mutational scanning techniques.The second contains 494 single point mutations of the SARS-CoV-2 RBD (PDB ID: 7FAE), which are ranked according to the predicted ΔΔG values, aiming to precisely identify the top five mutations that exhibit enhanced binding affinity.
In this study, mutant complex structures are generated by searching for rotamer library using BuildModel function of FoldX from wild-type complex crystal structures obtained from the PDB, and then are optimized using Optimize function of FoldX, which does a quick optimization to eliminate Van der Waals' clashes by slightly moving all sidechains.

Model architecture
We combine sequential, spatial and physicochemical information into the k-nearest neighbor residue graph of proteins 3D structure for message passing and aggregation, which is novel and crucial for protein structure modeling, whereas previous works (Ingraham et al. 2019, Dauparas et al. 2022) only consider spatial message passing among residues or atoms.Figure 1 illustrates the composition of each module in DDAffinity.Given the k-nearest residues graph of wild-type and mutant protein complex, respectively, we now present how to design a sharing structure encoder in what follows.

Structure encoder 2.3.1.1 Protein structure graph
A protein complex is made up of multiple polypeptide chains linked together by noncovalent PPIs.We represent protein complex structure containing L residues as a k-nearest neighbors graph of spatially and sequentially aggregated components G ¼ ðV 2 R L×d , E :¼ fE spatial , E sequential gÞ with node features V ¼ fv 1 ; . . .; v L g describing each residue and edge features E ¼ fe ij g j2Nði;kÞ capturing relationships between residue i and residue j, Nði; kÞ is the k-nearest neighbors of residue i.Here, the amino acid type of each residue i is in set f1; . . .; 20g.χ i 2 ½ − π; πÞ denotes the sidechain torsion angles of residue i.The number of χ i ranges from 0 to 4 depending on its amino acid type.In this work, the Euclidean distance reflects the spatial relations between the β-carbon (C β ) coordinates of residues.

Spatial and sequential encoder layer
The design of our proposed structure encoder architecture is inspired by the following two observations.Firstly, epitopes are commonly classified as linear epitopes, where antibodies bind to a contiguous stretch of residues in sequences through peptide bonds, and conformational epitopes, where antibodies bind to discontinuous residues in sequences but are in spatial proximity due to the folding of polypeptide chain (Zhang et al. 2012).Secondly, disulfide bonds are believed to decrease the conformational entropy and provide an increase in stability to the folded protein conformation (Flory 1956).However, the long-range interactions between protein partners, which have been shown to be the most discriminative features for protein binding affinity (Pires et al. 2014), are neglected in GNN models.
To thoroughly encode node features V and edge features E (see Section 2.3.2 for details of features extraction), we propose to adopt ProteinMPNN as the backbone of DDAffinity, which consists of a spatial encoder and a sequential encoder to aggregate V, E, and physicochemical attributes within the knearest neighbor graph derived from the protein 3D structures.Specifically, as shown in Fig. 2, we use three different neighbor residues to construct the k-nearest neighbor graph.(i) Spatial distance k 1 .A residue will be connected to its k 1 -nearest neighbors according to their spatial Euclidean distances.(ii) Sequential distance k 2 .The linear interactions of residues are defined as the sequential distance between the residue r i and its sequence neighbors if their sequential distances are no more than ðk 2 − 1Þ=2.(iii) Long-range distance k 3 .For efficiently capturing those dependencies that are long-range in sequence but local in 3D Euclidean space, neighbors of residue r i are ranked in ascending order according to their Euclidean distances, and discarded if their sequence distances are not greater than ðk 2 − 1Þ=2.After that, we select the k 3 -nearest neighbors from the ordered neighbor list.In summary, k ¼ k 1 þk 2 þk 3 .
Since we utilize weight sharing between the structure encoders of WT and MT , the following takes the structure encoder of WT as an example.Different from ProteinMPNN, Pre-Norm (applied to the input) (Domhan 2018, Wang et al. 2021) where MLPð�Þ is multilayer perceptron of three layers, Catð�Þ denotes the concatenation operation, LNð�Þ is layer normalization, FFNð�Þ is feed-forward network, α ðlÞ is learnable scalar, and λ ¼ 30 is a scaling constant.

For input h V ðlþ1Þ i
and h E ðlÞ ij , the ðlþ1Þth block for edge updates works as follows, Then we obtain edge hidden representation h_E and node hidden representation h_V from the last block output of the structure encoder.

Fusion layer
To prevent gradients vanishing when network going deeper or using max-pooling operation, we add skip connection between the input and output of structure encoder to avoid  DDAffinity: predicting the changes in binding affinity degradation as shown in Fig. 1.For input h V ð0Þ , h V spatial , h V sequential , h E spatial , and h E sequential , the fusion operation works as follows, where Mergeð�Þ represents the merging operation on spatial and sequential edge representations, i.e.

Residue centrality norm layer
Analysis showed that residue burial has significant effects on protein structure and stability (Nisthal et al. 2019).In this study, we propose normalized residue centrality to measure the different contributions among residues to ΔΔG.Residue centrality is defined as the number of C β atoms within a 10 Å distance ball of the query residue's C β atom, exclude itself (McPartlon and Xu 2023).For the residue of missing C β atom, coordinates of C β are calculated using virtual C β coordinates [see Dauparas et al. (2022) for details].
To quantify residue burial, we propose residue centrality normalization to regular the binding affinity difference of each residue.The normalized residue centrality represents its k-nearest neighbor weight in the protein complex, which is defined as, where ĉi is max-norm regularization of residue r i , c j is residue centrality of residue r j , w j is learnable scalar indexed by the k-nearest neighbors, Nði; kÞ is the k-nearest neighbors of residue r i which can be seen as "residue local environment", i 2 fLg¢f1; 2; . . .; Lg, L is the length of clipped residue patch.Formally, the residue centrality norm layer used in our model is defined as, where Rð�Þ denotes residue centrality normalization operation, and Ṽ is the final node representation.

Readout layer
After obtaining the hidden embeddings of residues, the structure encoder applies max-pooling to the node hidden embeddings to obtain a global protein representations.By sharing the same structure encoder architecture between the wildtype complex structure WT and the mutant complex structure MT , we obtain the representations of the wild-type structure and mutant structure, respectively.Then we subtract the representation of the wild-type structure from the mutant representation and feed it into another MLP to predict ΔΔG.To enforce anti-symmetry, we exchange the wild-type and mutant to predict − ΔΔG, and compute ðΔΔG − ð − ΔΔGÞÞ=2 as the final prediction.The network is trained using the MSE loss.

Edge features and node features 2.3.2.1 Residue-pairwise edge features
As for residue r i and its k-nearest neighbor residues r j , the edge feature e ij is defined over the local coordinate frame Ingraham et al. (2019)].First, we decompose e ij into five components: distance, direction, orientation, amino acid-pair, and position as, Then, the edge embedding where L is the patch size, k is the number of nearest neighbors, and d is the dimension of hidden embedding.And last, we derive each component of e ij from O i to O j as follows.
Firstly, we encode the Euclidean distance using Gaussian Basis Kernel function (Gao et al. 2023) to reflect the spatial relation between any atom-pair in the backbone atoms set The detailed steps are as follows, jjX j − X i jj ¼ γ ðai;ajÞ jjx j − x i jjþβ ðai;ajÞ ; (16) where O i defines a local coordinate system at residue r i , e direction ij represents the relative direction of x j in the reference frame of ðx i ; O i Þ, and qð�Þ defines the orientation with quaternion representation of spatial rotation matrix O T i O j .

Results
In this section, we present our experimental setup for training (Section 3.1) and evaluation metrics (Section 3.2) on two tasks: ΔΔG prediction and human antibody optimization.In predicting ΔΔG, we leverage the overall SKEMPI2, S1131, M1707, and M1340 for internal validation, and M595 for nonredundant blind testing (Section 3.3).In our case study, we focus on predicting the mutation effects of SARS-CoV-2 RBD and optimizing human antibody against SARS-CoV-2 tasks (Section 3.4).For anti-symmetric property validation, the results are listed in Supplementary Note S5.In addition, we perform internal experiments to gain better insights of our model (Sections 3.5 and 3.6).

Experimental setup
To fairly evaluate the performance of DDAffinity, we followed the same strategy used in RDE-Network splitting the training and testing sets.Specifically, we used a ten-fold cross-validation to train the model and assessed the performance based on complex-level without protein structure overlapping.Therefore, ten models were trained.
In our experiments, we implemented a three-layer structure encoder with hidden dimension of 128.The k-nearest parameters were set to k 1 ¼ 16, k 2 ¼ 3, and k 3 ¼ 7. The number of Gaussian Basis kernels was set to 16.The training process consisted of up to 100; 000 iterations (55 epochs) using the Adam optimizer with setting β 1 ¼ 0:9, β 2 ¼ 0:999, ɛ ¼ 1×10 − 8 , and a learning rate of 6×10 − 4 .In this work, we used an early-stopping strategy to accelerate the optimization process, that was, to halt the optimization process when the test score of the internal cross-validation for model selection stops improving (patience of 10 validation steps).Validation steps were performed every 1000 training steps to monitor the optimization procedure, and the batch size was set to 32 unless otherwise noted.Dropout ratios for the input embeddings, encoder layers, and fusion layer were set to 0.0, 0.1, and 0.1, respectively.Our models were implemented using PyTorch deep learning framework, and all experiments were running on a single A100 GPU.
When given WT and MT , we selected mutant residues as anchors and clipped WT and MT into residue patches, which are the 256 nearest neighbors of the mutant residues based on C β distances of inter-residues respectively, including the mutant residues.To improve the performance and generalization of DDAffinity, we implemented a two-step additive Gaussian noising strategy for the atomic coordinates of residues.Firstly, the crystallization of wild-type protein is generally not a physiological or an energy-optimal state.To reduce the inconsistencies introduced during the energy optimization process for generating mutant structures, as well as the inherent differences in proteins, the additive Gaussian noise (std ¼ 0:2Å) was combined with all input atomic coordinates, which yields the perturbed backbone dihedrals ðϕ; ψÞ and sidechain dihedrals ðχ ð1Þ ; χ ð2Þ ; χ ð3Þ ; χ ð4Þ Þ.Secondly, inspired by the ideas of ProteinMPNN that can improve predictive performance and make prediction algorithm more robust, we also incorporated Gaussian noise (std ¼ 0:2Å) to the atomic coordinates of protein backbone atom set A ¼ fN; C α ; C; O; C β g.Importantly, this perturbation was conducted without updating the backbone dihedrals and sidechain dihedrals.Additionally, we only implemented above mentioned two-step additive Gaussian noising strategy during training.

Evaluation metrics
In this paper, five evaluation metrics, including Pearson's correlation coefficient (r), Spearman's rank correlation coefficient (q), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Area Under the Receiver Operating Characteristic Curve (AUROC) were used in ΔΔG prediction task, with r as the primary metric.For evaluating the optimization of human antibody task, we utilized Hits@k (Ali et al. 2022) which denotes the ratio of the favorable mutations set K favorable that have been ranked among the top-k test dataset, i.e.Hits@k ¼ jft 2 K favorable j rankðtÞ≤kgj jK favorable j : ( When calculating AUROC, mutations were classified according to the sign of ΔΔG, i.e. whether mutations are stable or not.To measure the anti-symmetry, we calculated r and bias (hdi) between the predicted ΔΔG of the direct (ΔΔG dir ) and inverse (ΔΔG inv ) variations as mentioned in Montanucci et al. (2019).And the hdi is defined as,

Prediction of the changes in binding affinity
To evaluate the performance of DDAffinity, we conducted comparison experiments using 3-fold, 5-fold, and 10-fold cross-validation on SKEMPI2 with eight baselines in four categories.DDAffinity achieves state-of-the-art performances in all evaluation metrics on the multiple point mutation subset.
In 3-fold cross-validation, DDAffinity outperforms other methods across all metrics for the ΔΔG prediction on the single point mutation, multiple point mutation, and overall subset of SKEMPI2.With respect to the main evaluation metric of r, it is 0.624, 0.661, and 0.647, respectively, which are significant higher than baseline methods.In 5-fold crossvalidation, DDAffinity also demonstrates better performance than baseline on multiple point mutation and overall subsets, with r values of 0.654 and 0.658, respectively, higher than DDGPred.In 10-fold cross-validation, DDAffinity achieves r of 0.693, q of 0.640, RMSE of 1.944, MAE of 1.496, and AUROC of 0.812, respectively, surpassing the pre-trained approaches on the multiple point subset.In summary, DDAffinity exhibits strong generalization performance, particularly demonstrating a significant performance advantage for the multiple point mutation prediction.The results are shown in Table 1 and visualized in Supplementary Note S4.Subsequently, we performed complex-level validation on subsets S1131 and M1707 to compare the predictive performance of DDAffinity with eight published models which were introduced in Wang et al. (2023) (Table 2).Our results demonstrate that DDAffinity is able to make predictions with notably higher predictive performance than other recently proposed methods for protein multiple point mutation effects prediction, as well as for the prediction of protein single point mutation effects.
The performance of DDAffinity was further evaluated by performing training on the multiple point mutation dataset M1340 and nonredundant blind testing on M595 at mutation-level, which were two subsets derived from SKEMPI2.DDAffinity also exhibits notable improved predictive performance compared to baselines (Table 3).
Here, we propose a dataset partitioning scheme to minimize overlapping proteins between the testing and training sets, and we compare DDAffinity with the energy-function method FoldX on this dataset.The results are shown in Supplementary Table S1, and indicate that our model outperforms FoldX under the condition of reduced protein fold overlap (see Supplementary Note S1 for more details).
In addition, considering the use of ProteinMPNN as the backbone of DDAffinity, we have compared the performance of DDAffinity with ProteinMPNN.Taking the wild-type structure and the mutant structure as inputs, ProteinMPNN can obtain the score of the wild-type and the mutant, respectively, and subtracting the score of the mutant structure from the score of the wild-type structure gives the change in the score of the mutation.The comparison results on the SKEMPI2 dataset are shown in Supplementary Note S2.The results show that DDAffinity gives better predictions than ProteinMPNN.

Case study
To interpretate the evolution of SARS-CoV-2 and epistasis phenomenon, Starr et al. (2022) performed deep mutational scanning to experimentally quantify the effects of sitesaturation mutagenesis on ACE2 binding to the ancestral Wuhan-Hu-1 RBD (PDB ID: 6M0J).According to Wang et al. (2023), 15 crucial mutation sites have been identified, which result in 285 point-mutations in total.DDAffinity exhibits superior performance with r ¼ 0:456, outperforming all baseline models (Table 4).
In DDGPred (Shan et al. 2022), the authors reported five favorable point-mutations (PDB ID: 7FAE, including TH31W in CDR1, AH53F and NH57L in CDR2, RH103M and LH104F in CDR3) in the complementarity determining regions (CDRs) of the human antibody P36-5D2 against SARS-CoV-2.Thus, the task is to search out these favorable mutations within all 494 mutations of the CDRs.Specifically, these mutations are ranked based on predicted ΔΔG values of DDAffinity, with the most favorable (lowest ΔΔG values) mutation located at the top.As shown in Table 5, DDAffinity successfully identifies three of five favorable mutations, which obtains a best performance of 0.6 Hits@50 score.Specifically, DDAffinity ranks all five favorable mutations in the top 50%, which consistently outperforms all the baseline approaches and indicates its robustness and effectiveness in optimizing human antibodies task.

Ablation study
To demonstrate the added value of individual modules of DDAffinity on multiple point mutations, we conducted the following ablation experiments.We assessed the effects of five parts of the DDAffinity architecture as follows: residue centrality normalization (denoted as RC.), sequential interaction module (denoted as SI.), long-range interaction module (denoted as LR.), additive Gaussian noising strategy for input atomic coordinates (denoted as IA.), additive Gaussian noising strategy for backbone atomic coordinates (denoted as BA.), and the edge features of direction and orientation in local coordinate frame (denoted as DO.).The results are shown in Table 6.
From it, we have the following observations: (i) removing each module will lead to varying degrees of performance degradation, which indicates the positive effect of above components.(ii) For DDAffinity without IA.and BA., we observe a significant decrease in correlation and notable increase in error performance, which demonstrates that the two-step additive Gaussian noising strategy provides powerful generalization ability and robustness due to the ability to focus more on overall topological features rather than fine local structural details.(iii) DDAffinity without RC.performs slightly worse than DDAffinity, indicating the necessity of quantitative residue burial using residue centrality normalization to balance different contributions of residues.(iv) Due to the lack of local interactions in sequence and global interactions in long-range, we also see a performance decline for DDAffinity without SI.and LR.(v) The decrease after removing DO. suggests that edge features of direction and orientation do give a performance boost to the model.In summary, each module demonstrates the reasonable design of our method.Yu et al.

Exploration of the effect of k on model performance
The k is a critical parameter in the construction of k-nearest neighbor residue graphs.In this study, we set k ¼ k 1 þk 2 þk 3 , where k 1 ; k 2 , and k 3 represents the number of spatial nearest residues, sequential nearest residues, and long-range nearest residues, respectively.For all subsets of SKEMPI2, we found that the best performance is achieved when k 1 ¼ 16, k 2 ¼ 3, and k 3 ¼ 7, except for k 1 ¼ 24 and k 3 ¼ 9 on multiple point mutation subset (see Supplementary Note S3 for more details).To balance between the performance and efficiency, we set k 1 ¼ 16, k 2 ¼ 3, and k 3 ¼ 7 as the default hyperparameters setting of our model when comparing with baselines in Section 3.3.By comparing the prediction results of single point mutations with multiple point mutations, we observed that multiple point mutations tend to aggregate residues from more distant than single point mutations.We attribute this to the co-  DDAffinity: predicting the changes in binding affinity i425 evolutionary constraints of multiple point mutations, which may result in more stable protein 3D structures than single point mutations, as described in Section 3.3.

Conclusion
In this paper, we propose DDAffinity based on k-nearest neighbor residue graph, which aggregates information from spatial and sequential MPNN for accurately predicting changes in binding affinity caused by multiple point mutations.Compared with the benchmark studies, DDAffinity achieves high correlation and low error performance on the ΔΔG prediction task, which demonstrates the benefits of the proposed model architecture in correctly capturing the local and global synergistic epistasis property of PPIs in terms of covalent interactions and noncovalent interactions inherent in multiple point mutations.Furthermore, DDAffinity could be well applied for the mutation effects prediction of SARS-CoV-2 RBD variants and the optimization of human antibody against SARS-CoV-2.
Experimental results demonstrate that the large improvement margin is due to incorporate of additive Gaussian noising strategy during training.Ablation study shows that removing each part of DDAffinity will lead to a performance degradation on all performance metrics, which proves the positive effect of our proposed components.In addition, we propose residue centrality normalization as a quantitative substitute for residue burial to measure the different contributions of residues to ΔΔG.In summary, this article offers a novel protein structure encoder framework based on spatial and sequential MPNN, combined with residue centrality normalization and two-step additive Gaussian noising strategy to predict the changes in binding affinity caused by protein amino acid mutations.The in silico experiments show the potential of DDAffinity to be further served as a broad range of applications in protein structure representation learning, as well as in performing a systematic evaluation of protein multiple point mutations effects.
In our future work, we will continue to investigate novel methods to improve and extend the generalization ability and performance of our proposed model.First, we can explore to integrate residue embeddings from protein sequence pre-training language models and/or structure pre-training models.Second, a more efficient weighting function for residue burial needs to be fully explored compared to the present study.Third, since we only consider neighborhood message passing and aggregation of the mutation site in this work, we believe more context information from the protein complex is beneficial.We expect DDAffinity to be highly valuable in the exploration of how multiple point mutations affect binding affinity, protein function, and the role of mutations in diseases.

Figure 1 .
Figure 1.Overview of DDAffinity architecture.(a) Overall procedure of DDAffinity.(b) Detail components of structure encoder.

Figure 2 .
Figure 2. Illustration of k-nearest neighbors aggregation for spatial and sequential encoders.The k 1 (see (a) gray sticks and (b) left) and k 3 -nearest residues (see (a) green balls and (b) right) for spatial encoder, and the k 2 -nearest residues (see (a) gray balls and (b) center) for sequential encoder. i424 and ReZero (applied to residual architectures) (Bachlechner et al. 2021) training strategies are considered to improve training efficiency and convergence speed.Denote h V Thirdly, we map the residue-pair interactions for certain residue and its k-nearest neighbors to τ ðri;rjÞ as follows, τ ðri;rjÞ ¼ Mergeðfr i � 21þr j jj 2 Nði; kÞgÞ;(22)where r i ; r j 2 f0; 1; . . .; 20g with one additional dimension for unknown residue type, and τ ðri;rjÞ 2 f0; 1; . . .; 440g k , then we have residue-pair embeddings e where p ri is sequence position of residue r i , jp rj − p ri j≤32.cðri;rjÞ¼ 1 means residue r i and r j are in the same chain, otherwise c ðri;rjÞ ¼ 0.2.3.2.2 Residue node featuresTo generate node features, we calculate the backbone dihedral angles ðϕ i ; ψ i Þ and sidechain dihedral angles ðχ ð1Þ i ; χ ð2Þ i ; χ ð3Þ i ; χ ð4Þ i Þ of residue r i .Subsequently, we embed these angles onto the torus space, denoted as f sin ; cos g× ðϕ i ; ψ i ; χ

Table 1 .
Evaluation on the SKEMPI2 dataset.a a Bold values indicate the best results.The dash sign indicates that cross-validation is not applicable for the energy function methods.bResultsare fromLiu et al. (2023).cResultsare from released source code.

Table 2 .
Comparison of r at complex-level for S1131 and M1707.a Bold values indicate the best results.The dash sign indicates the results of the corresponding methods are not available.b Results are from Wang et al. (2023).
a c Results are from released source code.

Table 3 .
Performance comparison of nonredundant blind testing on the multiple point mutation dataset M595.

Table 4 .
Pearson's correlation coefficient of mutation effects on binding affinity of SARS-CoV-2 RBD.
a Results are fromLiu et al. (2023).bResultsare from released source code.